NSF PAR Search | NSF Public Access Repository

Automated Test Transfer across Android Apps using Large Language Models

https://doi.org/10.1145/3728975

Beyzaei, Benyamin; Talebipour, Saghar; Rafiei, Ghazal; Medvidović, Nenad; Malek, Sam (June 2025, Proceedings of the ACM on Software Engineering)

The pervasiveness of mobile apps in everyday life necessitates robust testing strategies to ensure quality and efficiency, especially through end-to-end usage-based tests for the apps' user interfaces (UIs). However, manually creating and maintaining such tests can be costly for developers. Since many apps share similar functionalities beneath diverse UIs, previous works have shown the possibility of transferring UI tests across different apps within the same domain, thereby eliminating the need for writing the tests manually. However, these methods have struggled to accommodate real-world variations: they often fall short when the source and target apps are not very similar, or they fail to accurately transfer test oracles. This paper introduces an innovative technique, LLMigrate, which leverages Large Language Models (LLMs) to efficiently transfer usage-based UI tests across mobile apps. Our experimental evaluation shows LLMigrate can achieve a 97.5% success rate in automated test transfer, reducing the manual effort required to write tests from scratch by 91.1%. This represents an improvement of 9.1% in success rate and 38.2% in effort reduction compared to the best-performing prior technique, setting a new benchmark for automated test transfer.
Free, publicly accessible full text available June 22, 2026
SAIN: A Community-Wide Software Architecture INfrastructure

https://doi.org/10.1109/ICSE-Companion58688.2023.00095

Garcia, Joshua; Mirakhorli, Mehdi; Xiao, Lu; Malek, Sam; Kazman, Rick; Cai, Yuanfang; Medvidović, Nenad (May 2023, IEEE)

Full Text Available
Avgust: automating usage-based test generation from videos of app executions

https://doi.org/10.1145/3540250.3549134

Zhao, Yixue; Talebipour, Saghar; Baral, Kesina; Park, Hyojae; Yee, Leon; Khan, Safwat Ali; Brun, Yuriy; Medvidović, Nenad; Moran, Kevin (November 2022, Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering)

Writing and maintaining UI tests for mobile apps is a time-consuming and tedious task. While decades of research have produced automated approaches for UI test generation, these approaches typically focus on testing for crashes or maximizing code coverage. By contrast, recent research has shown that developers prefer usage-based tests, which center around specific uses of app features, to help support activities such as regression testing. Very few existing techniques support the generation of such tests, as doing so requires automating the difficult task of understanding the semantics of UI screens and user inputs. In this paper, we introduce Avgust, which automates key steps of generating usage-based tests. Avgust uses neural models for image understanding to process video recordings of app uses to synthesize an app-agnostic state-machine encoding of those uses. Then, Avgust uses this encoding to synthesize test cases for a new target app. We evaluate Avgust on 374 videos of common uses of 18 popular apps and show that 69% of the tests Avgust generates successfully execute the desired usage, and that Avgust’s classifiers outperform the state of the art.
Identifying casualty changes in software patches

https://doi.org/10.1145/3468264.3468624

Sejfia, Adriana; Zhao, Yixue; Medvidović, Nenad (August 2021, ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering)
Noise in software patches impacts their understanding, analysis, and use for tasks such as change prediction. Although several approaches have been developed to identify noise in patches, this issue has persisted. An analysis of a dataset of security patches for the Tomcat web server, which we further expanded with security patches from five additional systems, uncovered several kinds of previously unreported noise which we call nonessential casualty changes. These are changes that themselves do not alter the logic of the program but are necessitated by other changes made in the patch. In this paper, we provide a comprehensive taxonomy of casualty changes. We then develop CasCADe, an automated technique for identifying casualty changes. We evaluate CasCADe with several publicly available datasets of patches and tools that focus on them. Our results show that CasCADe is highly accurate, that the kinds of noise it identifies occur relatively commonly in patches, and that removing this noise improves upon the evaluation results of a previously published change-based approach.
Full Text Available
Empirically assessing opportunities for prefetching and caching in mobile apps

https://doi.org/10.1145/3238147.3238215

Zhao, Yixue; Wat, Paul; Laser, Marcelo Schmitt; Medvidović, Nenad (January 2018, 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE 2018))

Full Text Available
